docs(clp-package): Rewrite S3 log compression guide to reflect new API and script features. #1510
Walkthrough
Replaces single-mode S3 compression guidance with a two-mode guide covering `s3-object` and `s3-key-prefix`.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant User as User/CI
participant CLI as sbin/compress-from-s3.sh
participant S3 as S3 (URL / Key-Prefix)
participant Compressor as Compression pipeline
Note over User,CLI #DDEBF7: User selects mode and supplies args
User->>CLI: invoke with mode=`s3-object` or `s3-key-prefix`\nargs: `url` / `object-key` / `inputs-from`
alt s3-object (single-object or inputs-from)
CLI->>S3: resolve object URLs (validate non-empty common prefix)
S3-->>CLI: object(s) metadata / URLs
else s3-key-prefix
CLI->>S3: list objects by single key-prefix
S3-->>CLI: object list
end
CLI->>Compressor: stream selected objects to compression pipeline
Compressor-->>User: output archive/result
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
docs/src/user-docs/guides-using-object-storage/clp-usage.md(1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/user-docs/guides-using-object-storage/clp-usage.md
[uncategorized] ~10-~10: Loose punctuation mark.
Context: ... two modes of operation: * s3-object : Compress S3 objects specified by their ...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~11-~11: Loose punctuation mark.
Context: ...y their full S3 URLs. * s3-key-prefix : Compress all S3 objects under a given S...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~80-~80: Possible missing comma found.
Context: ...is the prefix of all logs you wish to compress and must begin with the <all-logs-...
(AI_HYDRA_LEO_MISSING_COMMA)
🪛 markdownlint-cli2 (0.18.1)
docs/src/user-docs/guides-using-object-storage/clp-usage.md
88-88: Link and image reference definitions should be needed
Unused link or image reference definition: "add-iam-policy"
(MD053, link-image-reference-definitions)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: lint-check (macos-15)
🔇 Additional comments (1)
docs/src/user-docs/guides-using-object-storage/clp-usage.md (1)
52-56: Verify documentation clarity for dual-mode constraints.
The notes correctly document current limitations (common prefix requirement for s3-object, single URL constraint for s3-key-prefix) and indicate these may be relaxed in future releases. This is clear, though you may want to consider highlighting the common prefix validation more prominently in the s3-object section since it could cause job failures.
Confirm whether the common prefix validation requirement is prominently enough communicated to users, especially given that objects can be specified via an `--inputs-from` file (which could make the validation requirement less obvious).
Also applies to: 83-86
* `<bucket-name>` is the name of the S3 bucket containing your logs.
* `<region-code>` is the AWS region [code][aws-region-codes] for the S3 bucket containing your
  logs.
* `<key-prefix>` is the prefix of all logs you wish to compress and must begin with the
🧹 Nitpick | 🔵 Trivial
Add clarifying comma for readability.
A comma after "compress" improves clarity of the compound modifier.
Apply this diff to improve readability:
- * `<key-prefix>` is the prefix of all logs you wish to compress and must begin with the
+ * `<key-prefix>` is the prefix of all logs you wish to compress, and must begin with the
🤖 Prompt for AI Agents
In docs/src/user-docs/guides-using-object-storage/clp-usage.md around line 80,
the sentence "* `<key-prefix>` is the prefix of all logs you wish to compress
and must begin with the" is missing a clarifying comma after "compress"; update
the line to insert a comma after "compress" so the compound modifier reads "logs
you wish to compress, and must begin with the" to improve readability.
* `s3-object` : Compress S3 objects specified by their full S3 URLs.
* `s3-key-prefix` : Compress all S3 objects under a given S3 key prefix.
`s3-object` mode requires the input object keys to share a non-empty common prefix. If the input
object keys do not share a common prefix, they will be rejected and no compression job will be
created. This limitation will be relaxed in a future release.
I'm not sure whether we should mention <all-logs-prefix> as we did when discussing key-prefix compression.
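For readers following this thread, the common-prefix requirement quoted above can be illustrated with a minimal Python sketch. This is not CLP's implementation: the function name `validate_object_keys` is hypothetical, and the sketch assumes the prefix is compared character by character, which may differ from what the package actually does.

```python
import os


def validate_object_keys(keys: list[str]) -> str:
    """Return the shared prefix of `keys`, or raise if there is none.

    Assumption: the prefix is computed character by character, so
    "logs/app-1" and "logs/app-2" share the prefix "logs/app-".
    """
    # os.path.commonprefix compares strings character-wise and returns ""
    # for an empty list or when the strings share no leading characters.
    prefix = os.path.commonprefix(keys)
    if not prefix:
        raise ValueError("object keys do not share a non-empty common prefix")
    return prefix


if __name__ == "__main__":
    print(validate_object_keys(["logs/app/a.log", "logs/app/b.log"]))
    try:
        validate_object_keys(["alpha/x.log", "beta/y.log"])
    except ValueError as err:
        print(f"rejected: {err}")
```

Under this assumption, a list of keys read from an inputs file would be rejected as a whole if even one key starts with a different character than the rest, which is why the limitation is easy to trip over.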
hey @LinZhihao-723, looking good, a few comments to address.
Co-authored-by: Quinn Taylor Mitchell <[email protected]>
a few more nit comments to tighten it up, but otherwise lgtm!
lgtm!
`s3-object` mode allows you to specify individual S3 objects to compress by their full URLs. To
use this mode, call the `sbin/compress-from-s3.sh` script as follows, and replace the fields in
- `s3-object` mode allows you to specify individual S3 objects to compress by their full URLs. To
- use this mode, call the `sbin/compress-from-s3.sh` script as follows, and replace the fields in
+ The `s3-object` mode allows you to specify individual S3 objects to compress by using their full
+ URLs. To use this mode, call the `sbin/compress-from-s3.sh` script as follows, replacing fields in
imo, the change from "and replace the" to "replacing" weakens the tone of the sentence, because "replacing" sort of dangles there (even though it is technically agreeing with the subject). but neither one is wrong, so I guess it is up to personal taste.
* `https://<bucket-name>.s3.<region-code>.amazonaws.com/<prefix>`
* `https://s3.<region-code>.amazonaws.com/<bucket-name>/<prefix>`
* The fields in `<url>` are as follows:
* `<object-url>` is a URL identifying the S3 object to compress. It can be written in either of two
- * `<object-url>` is a URL identifying the S3 object to compress. It can be written in either of two
+ * `<object-url>` is a URL identifying the S3 object to compress. It can be written in one of two
I suggested changing to "either" because it more readily indicates that the user can write the URL in either one of the two formats. having just "one of two" implies that only one of the formats will be valid in each use case, and that the user has to figure out which one to use.
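To make the point about the two interchangeable formats concrete, here is a small Python sketch that accepts either URL form quoted earlier in this thread (AWS calls them virtual-hosted-style and path-style) and yields the same region, bucket, and key. The function `parse_s3_url` and its regular expressions are illustrative assumptions, not the parser used by `sbin/compress-from-s3.sh`.

```python
import re
from urllib.parse import urlparse

# Hypothetical patterns for the two host layouts shown in the docs.
_PATH_STYLE_HOST = re.compile(r"^s3\.(?P<region>[^.]+)\.amazonaws\.com$")
_VIRTUAL_HOST = re.compile(
    r"^(?P<bucket>.+)\.s3\.(?P<region>[^.]+)\.amazonaws\.com$"
)


def parse_s3_url(url: str) -> tuple[str, str, str]:
    """Split an S3 URL into (region, bucket, key) for either format."""
    parsed = urlparse(url)
    path = parsed.path.lstrip("/")

    # Path-style: https://s3.<region-code>.amazonaws.com/<bucket-name>/<prefix>
    match = _PATH_STYLE_HOST.match(parsed.netloc)
    if match:
        bucket, _, key = path.partition("/")
        if not bucket:
            raise ValueError(f"no bucket name in path-style URL: {url}")
        return match.group("region"), bucket, key

    # Virtual-hosted-style: https://<bucket-name>.s3.<region-code>.amazonaws.com/<prefix>
    match = _VIRTUAL_HOST.match(parsed.netloc)
    if match:
        return match.group("region"), match.group("bucket"), path

    raise ValueError(f"unrecognized S3 URL format: {url}")


if __name__ == "__main__":
    for example in (
        "https://my-bucket.s3.us-east-2.amazonaws.com/logs/app/a.log",
        "https://s3.us-east-2.amazonaws.com/my-bucket/logs/app/a.log",
    ):
        print(parse_s3_url(example))
```

Both example URLs (with a made-up bucket and key) resolve to the same triple, which is the sense in which the user may pick either format.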
`s3-key-prefix` mode allows you to compress all objects under a given S3 key prefix. To use this
mode, call the `sbin/compress-from-s3.sh` script as follows, and replace the fields in angle
brackets (`<>`) with the appropriate values:
- `s3-key-prefix` mode allows you to compress all objects under a given S3 key prefix. To use this
- mode, call the `sbin/compress-from-s3.sh` script as follows, and replace the fields in angle
- brackets (`<>`) with the appropriate values:
+ The `s3-key-prefix` mode allows you to compress all objects under a given S3 key prefix. To use this
+ mode, call the `sbin/compress-from-s3.sh` script as follows, replacing fields in angle brackets
+ (`<>`) with the appropriate values:
same comment as above re. "replacing"
* `<key-prefix-url>` is a URL identifying the S3 key prefix to compress. It can be written in either
  of two formats:
- * `<key-prefix-url>` is a URL identifying the S3 key prefix to compress. It can be written in either
-   of two formats:
+ * `<key-prefix-url>` is a URL identifying the S3 key prefix to compress. It can be written in one of
+   two formats:
same comment as above re. "either"
Co-authored-by: kirkrodrigues <[email protected]>

Description
As the title suggests.
Checklist
breaking change.
Validation performed
Summary by CodeRabbit